Abstract


To improve the taxonomy and systematics of Porcellanidae within the evolution of
Anomura, we describe the complete mitochondrial genome (mitogenome) sequence of
Pisidia serratifrons, which is 15,344 bp in size, contains the entire set of 37 genes and has
an AT-rich region. Compared with the pancrustacean ground pattern, at least five gene
clusters (or genes) differ from the typical arrangement, involving eleven tRNA genes and
four PCGs. The tandem duplication/random loss and recombination models were used to
explain the observed large-scale gene re-arrangements. The phylogenetic results showed
that all Porcellanidae species clustered together as a group with strong nodal support.
Most Anomura superfamilies were found to be monophyletic, except Paguroidea.
Divergence time estimation implies that the age of Anomura is over 225 MYA, dating back
to at least the late Triassic. Most of the extant superfamilies and families arose during the
late Cretaceous to early Tertiary. In general, the results obtained in this study will contribute
to a better understanding of gene re-arrangements in Porcellanidae mitogenomes and
provide new insights into the phylogeny of Anomura.


© lu J et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), 
which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction


The infraorder Anomura is a highly diverse group of decapod crustaceans, including seven
superfamilies, 20 families, 335 genera and more than 2500 species in total; some king
crabs and squat lobsters are economically important (Dawson 1989, Lovrich 1997,
Poore et al. 2011). However, the phylogenetic relationships within Anomura remain
controversial. Earlier classifications, based on adult morphological characteristics, often
differed in high-level composition (Baba 2008). Recently, more and more molecular and
morphological data have been used to reconstruct the phylogeny of Anomura (Schnabel
and Ahyong 2010, Kim et al. 2013, Gong et al. 2019). Although the monophyly of Anomura
is widely accepted, phylogenetic relationships at high taxonomic levels remain unresolved
and under continuous debate. Initially, the superfamily Galatheoidea was
divided into seven families (i.e. Galatheidae, Munididae, Munidopsidae, Porcellanidae,
Aeglidae, Chirostylidae and Kiwaidae) (Macpherson et al. 2005, Schnabel et al. 2011).
Subsequently, Chirostylidae and Kiwaidae were moved to the superfamily Chirostyloidea,
while Aeglidae was moved to Aegloidea (Perez-Losada et al. 2002, McLaughlin et al.
2007). The current classification scheme divides Galatheoidea into Galatheidae (squat
lobsters), Munididae, Munidopsidae and Porcellanidae (porcelain crabs) (Ahyong et al.
2010). So far, the phylogenetic relationships of some families in Anomura are still unclear.
Therefore, data from a sufficient number of species are required for a comprehensive
phylogenetic analysis of the infraorder Anomura.


The porcelain crab Pisidia serratifrons is a marine crustacean that lives in a variety of
shallow-water habitats at depths of less than 200 metres. It belongs to the subphylum
Crustacea, order Decapoda, infraorder Anomura, family Porcellanidae and genus Pisidia
(Kim and Ko 2011). P. serratifrons is mainly distributed in southern Korea, southern Japan
and the southeast coastal region of China (Morton 1997, Qing et al. 2016). So far, most
studies of this species have focused on morphology and growth (Morton 1997, Kim and Ko
2011).


The mitogenome of metazoans is usually 14-20 kb in size and encodes a set of 37
genes, including 13 protein-coding genes (PCGs) (cox1-3, cob, nad1-6, nad4L, atp6 and
atp8), two ribosomal RNA genes (rrnS and rrnL) and 22 transfer RNA genes (tRNAs),
together with an AT-rich region (also called the D-loop or control region, CR), which
contains some initiation sites for transcription and replication of the genome (Smith and
Smith 2002, Sato and Sato 2013). Due to its haploid properties, matrilineal inheritance and
rapid evolutionary rate, the mitogenome is increasingly being used in studies of
re-arrangement trends and in phylogenetic analyses. With the rapid development of
sequencing technology, more and more complete mitogenome sequences have been used
in comparative genomics, molecular evolution and phylogeny (Tan et al. 2019).


Gene re-arrangements in Anomura mitogenomes are relatively common (Arndt and
Smith 1998, Hickerson and Cunningham 2000). As the gene order of animal mitogenomes
remains stable over long periods of time and a complex shared derived gene order
is unlikely to appear independently in different lineages, gene re-arrangements can be
used as an indicator to clarify interspecific relationships. So far, several hypotheses
have been suggested to help explain gene re-arrangements in animal mitogenomes. The
recombination model and the tandem duplication/random loss (TDRL) model are the most
commonly accepted. Recombination models involve the breaking and rejoining
of DNA strands. The TDRL model assumes that the re-arranged gene order arises via
tandem duplication followed by random deletion of some of the duplicated copies (Moritz
and Brown 1987). This model has been widely used to explain the translocation of genes
encoded on the same strand (Posada and Crandall 1998).


In this study, we sequenced the complete mitogenome of P. serratifrons. We report the
gene structure and gene re-arrangement of the P. serratifrons mitogenome and conduct a
phylogenetic analysis of 31 Anomura species, based on the nucleotide sequences of 13
PCGs. Based on the similarities and differences in gene order amongst the mitogenomes,
the possible re-arrangement process is discussed in order to gain a better understanding
of the re-arrangement events and evolutionary mechanisms of the Anomura mitogenome.


Materials and methods 
Sampling and DNA extraction 


A specimen of P. serratifrons was collected from Zhoushan, Zhejiang Province, China
(29°98'30N, 122°96'99"E). The specimen was immediately preserved in absolute ethanol
after collection and then stored at -20°C. The specimen was identified by morphology and
fresh tissues were dissected from the operculum and preserved in absolute ethanol before
DNA extraction. Total genomic DNA was extracted using the salt-extraction procedure
(Aljanabi and Martinez 1997) with a slight modification and stored at -20°C.


Genome sequencing, assembly and annotation 


The mitogenome of P. serratifrons was sequenced by Origin gene Co. Ltd., Shanghai,
China, on the Illumina HiSeq X Ten platform. HiSeq X Ten libraries with
an insert size of 300-500 bp were generated from the genomic DNA. About 10 Gb of raw
data were generated for each library. Low-quality reads, adapters, sequences with high
“N” ratios and sequences shorter than 25 bp were removed. The clean reads were
assembled de novo using the software NOVOPlasty (Dierckxsens et al. 2017)
(https://github.com/ndierckx/NOVOPlasty) and the resulting mitogenome was annotated
using MITOS tools (Bernt et al. 2012) (MITOS Web Server (uni-leipzig.de)) and corrected
manually. To confirm the correct sequences, we compared the
assembled mitochondrial genes with those of other Porcellanidae species and identified
the mitogenomic sequences by checking the cox1 barcode sequence with NCBI BLAST
(Altschul et al. 1997). The abnormal start and stop codons were determined by comparing
them with the start and stop codons of other marine crustaceans. Then, the reads were
reconstructed using the de novo assembly programme. The complete mtDNA was
annotated using the software Sequin version 16.0. The mitogenome map of
P. serratifrons was drawn using the online tool Proksee (https://proksee.ca) (Grant and
Stothard 2008). The predicted secondary structures of the tRNA genes were plotted
using the MITOS Web Server. The relative synonymous codon usage (RSCU) values and
substitution saturation for the 13 PCGs were calculated with DAMBE 5 (Xia and Xie 2001)
and analysed with MEGA 7 (Kumar et al. 2016). GC-skew and AT-skew were used to
determine the base compositional difference and strand asymmetry amongst the samples.
Composition skew values were calculated according to the following formulae:
AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C).
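
As an illustration of the skew formulae, the following minimal Python sketch computes AT-skew and GC-skew either from base percentages or from a raw sequence string. The function names are hypothetical and are not part of the software used in this study; the percentages in the usage note are those reported for the whole mitogenome in the results.

```python
from typing import Tuple


def skew(x: float, y: float) -> float:
    """Return the composition skew (x - y) / (x + y)."""
    return (x - y) / (x + y)


def composition_skews(a: float, t: float, g: float, c: float) -> Tuple[float, float]:
    """Return (AT-skew, GC-skew) from base counts or percentages."""
    return skew(a, t), skew(g, c)


def skews_from_sequence(seq: str) -> Tuple[float, float]:
    """Count A, T, G and C in a DNA string and return (AT-skew, GC-skew)."""
    seq = seq.upper()
    return composition_skews(
        seq.count("A"), seq.count("T"), seq.count("G"), seq.count("C")
    )


# Base percentages reported for the complete P. serratifrons mitogenome.
at_skew, gc_skew = composition_skews(37.78, 36.51, 9.7, 16.01)
```

With the rounded percentages above, the sketch returns values close to the reported AT-skew (0.017) and GC-skew (-0.246); small differences in the last digit are expected because the inputs are rounded.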


Phylogenetic analysis 


The phylogenetic relationships within Anomura were reconstructed using the sequences of
the 13 PCGs from a total of 34 complete mitogenome sequences downloaded from the
GenBank database (https://www.ncbi.nlm.nih.gov/genbank/), with two species of
Ocypodea added to serve as the outgroup (Suppl. material 2). The phylogenetic
relationships were analysed with Maximum Likelihood (ML) in IQ-TREE 1.6.2 and Bayesian
Inference (BI) in MrBayes 3.2 (Perna and Kocher 1995,
Huelsenbeck and Ronquist 2001, Nguyen et al. 2015). The ML tree was inferred with
1000 ultrafast bootstrap replicates in IQ-TREE 1.6.2. The best-fit model
for each partition was K3Pu+F+R4, selected according to the Bayesian Information
Criterion (BIC). BI was performed in MrBayes 3.2 and the best-fit evolutionary models were
determined using MrMTgui (Ronquist et al. 2012), which links PAUP,
ModelTest and MrModelTest across platforms. The best-fit model for MrBayes
(GTR + I + G) was selected by AIC in MrModelTest 2.3 (Nylander et al. 2004). The
Bayesian phylogenetic analyses were performed using the parameter values estimated
with the commands in MrModelTest or ModelTest (nst = 6, rates = invgamma) (Posada and
Crandall 1998). Two independent runs, each with three hot chains and one cold chain,
were performed simultaneously by Markov Chain Monte Carlo (MCMC) sampling to
estimate the posterior distribution. The MCMC chains were run for 2,000,000 generations
and sampled every 1000 generations, with a relative burn-in of 25%. The convergence of
the independent runs was evaluated by the average standard deviation of split frequencies
(< 0.01). The phylogenetic trees were visualised and edited using FigTree v.1.4.3
(Rambaut 2017).
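
To make the MCMC settings concrete, the short Python sketch below works out how many sampled trees per run remain after the relative burn-in is discarded. It is only arithmetic on the settings stated above; the function name is hypothetical and the calculation assumes, for simplicity, that the initial state of the chain is not counted as a sample.

```python
def retained_samples(generations: int, sample_every: int, burnin_fraction: float) -> int:
    """Number of sampled trees kept per run after discarding the burn-in.

    Assumption: the starting state (generation 0) is not counted as a sample.
    """
    total = generations // sample_every          # trees written to file
    discarded = int(total * burnin_fraction)     # relative burn-in
    return total - discarded


# Settings described for the Bayesian analysis:
# 2,000,000 generations, sampled every 1000 generations, 25% burn-in.
kept_per_run = retained_samples(2_000_000, 1000, 0.25)
```

Under these assumptions, each run contributes 1500 post-burn-in trees to the consensus.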


Divergence time estimation 


The divergence times of Anomura were estimated with the programme BEAST v.1.10.4
(Joseph and Drummond 2011) under the uncorrelated strict clock model, using fossil
calibration points with a normal prior distribution (Suppl. material 3).
The HKY substitution model and the Yule speciation process model were specified in
BEAUti. Four independent Markov Chain Monte Carlo (MCMC) runs were performed, each
with a chain length of 800,000,000 generations, sampled every 8000 generations. The first
10% of the trees were discarded as burn-in
and each parameter was examined by its effective sample size (ESS) (> 200, as
recommended) in Tracer v.1.6. Trees were summarised using TreeAnnotator and a
chronogram was constructed in FigTree v.1.4.2.


Results and discussion 
Genome structure and composition 


The complete mitogenome sequence of P. serratifrons is a typical closed-circular molecule
of 15,344 bp in size (GenBank accession number OM461359), which is similar in length
to the published Porcellanidae mitogenomes (Tan et al. 2014, Lee et al. 2016), which range
from 15,324 to 15,348 bp (Suppl. material 2). The gene content of the P. serratifrons
mitogenome (Table 2) is the same as that of most published Anomura (Hickerson and
Cunningham 2000, Yang et al. 2008, Kim et al. 2013), comprising 37 genes (13 PCGs, 22
tRNAs and two rRNAs, rrnL and rrnS), as well as a brief non-coding region. All the genes
identified are shown in Fig. 1 and Table 1. Most of the 37 genes are located on the heavy
(H-) strand, except four PCGs (i.e. nad5, nad4, nad4L and nad1), eight tRNAs (i.e.
tRNA-Phe, His, Pro, Leu, Val, Gln, Cys and Tyr) and the two rRNAs, which are located on
the light (L-) strand (Fig. 1). There are seven overlapping regions in the P. serratifrons
mitogenome, one of them longer than 10 bp, at trnL1 (23 bp), and the other six shorter
than 10 bp, at nad4 and atp8 (7 bp each), cox1 (5 bp), cob (2 bp) and trnF and atp6 (1 bp
each) (Table 1). The P. serratifrons mitogenome also
contains 376 bp of intergenic spacers located in 20 regions, ranging from 1 to 57 bp (Table
1), suggesting the occurrence of tandem duplications and deletions of redundant
genes. The nucleotide composition of the P. serratifrons mitogenome is A, 37.78%; T,
36.51%; G, 9.7%; and C, 16.01%, with a high AT bias. The A + T content of the
mitogenome is 74.29%. The AT-skew and GC-skew values were calculated for the
chosen complete mitogenomes (Table 2). The AT-skew of the P. serratifrons mitogenome is
positive (0.017) and the GC-skew is negative (-0.246),
indicating that As and Cs are more abundant than Ts and Gs.
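
The overlaps and intergenic spacers described above follow directly from the gene coordinates. The Python sketch below shows one way to derive them, assuming 1-based inclusive coordinates as in Table 1; the function name and the tuple layout are illustrative choices, not part of the annotation pipeline used here. The four genes in the usage example are the first four rows of Table 1.

```python
from typing import List, Tuple

# (gene name, start, end) with 1-based inclusive coordinates
Gene = Tuple[str, int, int]


def gaps_between_genes(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (gene, next gene, gap) for each pair of consecutive genes.

    A positive gap is an intergenic spacer, zero means the genes are directly
    adjacent and a negative gap is the length of an overlap.
    """
    result = []
    for (name1, _start1, end1), (name2, start2, _end2) in zip(genes, genes[1:]):
        result.append((name1, name2, start2 - end1 - 1))
    return result


# First four genes listed in Table 1.
first_genes = [
    ("cox1", 1, 1533),
    ("trnL2", 1529, 1592),
    ("cox2", 1596, 2280),
    ("trnK", 2281, 2351),
]
gaps = gaps_between_genes(first_genes)
```

For these rows, the sketch reproduces the 5 bp overlap between cox1 and trnL2, the 3 bp spacer between trnL2 and cox2 and the direct adjacency of cox2 and trnK.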


Table 1.


Features of the mitochondrial genome of P. serratifrons.


Gene    From    To      Length  Amino acids  Start/stop codon  Anticodon  Intergenic region  Strand
cox1    1       1533    1533    510          ACG/TAA                      -5                 H
trnL2   1529    1592    64                                     TAA        3                  H
cox2    1596    2280    685     228          ATG/T(AA)                    0                  H
trnK    2281    2351    71                                     TTT        3                  H
trnG    2355    2421    67                                     TCC        0                  H
nad3    2422    2772    351     116          ATT/TAA                      23                 H
trnA    2796    2862    67                                     TGC                           H
trnF    2866    2929    64                                     GAA                           L
nad5    2929    4641    1713    570          ATG/TAA                                         L
trnH    4660    4725    66                                     GTG                           L
nad4    4728    6068    1341    446          ATG/TAA                                         L
nad4L   6062    6343    282     93           ATT/TAA                                         L
trnT    6375    6443    69                                     TGT                           H
nad6    6489    6980    492     163          ATT/TAA                                         H
cob     6986    8122    1137    378          TTG/TAA                                         H
trnS2   8121    8190    70                                     TGA                           H
trnP    8198    8264    67                                     TGG                           L
nad1    8273    9202    930     309          ATA/TAG                                         L
trnL1   9233    9298    66                                     TAG                           L
rrnL    9276    10578   1303                                                                 L
trnV    10612   10685   74                                     TAC                           L
rrnS    10687   11462   776                                                                  L
CR      11463   11834   371
trnM    11834   11901   68                                     CAT        37                 H
trnI    11939   12002   64                                     GAT        57                 H
nad2    12060   13055   996     331          ATT/TAA                                         H
trnD    13056   13122   67                                     GAT                           H
atp8    13123   13281   159     52           ATG/TAA                                         H
atp6    13275   13949   675     224          ATG/TAA                                         H
cox3    13949   14740   792     263          ATG/TAA                                         H
trnR    14746   14809   64                                     TCG                           H
trnN    14810   14875   66                                     GTT                           H
trnS1   14876   14940   65                                     TCT                           H
trnE    14941   15010   70                                     TTC                           H
trnW    15014   15083   70                                     TCA        15                 H
trnQ    15099   15165   67                                     TTG                           L
trnC    15182   15245   64                                     GCT        12                 L
trnY    15258   15324   67                                     GTA                           L


Table 2.


Composition and skewness of the P. serratifrons mitogenome.


Region      A%      T%      G%      C%      (A+T)%  AT-skew  GC-skew  Length (bp)
Mitogenome  37.78   36.51   9.7     16.01   74.29   0.017    -0.246   15344
PCGs        29.72   42.98   13.79   13.51   72.70   -0.182   0.011    11077
cox1        29.29   38.55   15.46   16.70   67.84   -0.137   -0.039   1533
cox2        34.26   36.35   12.12   17.37   70.51   -0.031   -0.178   685
atp8        41.51   42.77   6.92    8.81    84.28   -0.015   -0.120   159
atp6        30.96   41.63   11.26   16.15   72.59   -0.147   -0.178   675
cox3        31.19   38.26   13.51   17.05   69.44   -0.102   -0.116   792
nad3        31.34   45.3    10.26   13.11   76.64   -0.280   0.341    351
nad5        28.51   46.90   15.60   8.99    75.42   -0.244   0.269    1680
nad4        26.10   48.32   17.30   8.28    74.42   -0.299   0.353    1341
nad4L       25.89   49.29   19.15   5.67    75.18   -0.311   0.543    282
nad6        31.98   44.19   7.17    16.67   76.16   -0.160   -0.398   516
cob         31.22   37.55   12.40   18.82   68.78   -0.092   -0.206   1137
nad1        26.02   46.24   18.60   9.14    72.26   -0.280   0.341    930
nad2        31.43   45.18   7.93    15.46   76.61   -0.180   -0.322   996
tRNAs       40.15   36.83   12.80   10.22   76.98   0.025    0.374    1477
rRNAs       39.83   37.90   15.30   6.97    77.73   -0.182   0.011    2079
AT-rich     31.62   42.16   8.92    17.30   77.78   -0.143   -0.320   371


PCGs and codon usage 


The initial and terminal codons of all PCGs of P. serratifrons are listed in Table 3.
P. serratifrons has 13 PCGs in the typical order found in Anomura species, comprising
seven NADH dehydrogenase genes (nad1-nad6, nad4L), three cytochrome c oxidase genes
(cox1-cox3), two ATPase genes (atp6, atp8) and cytochrome b (cob). The total length of the
13 PCGs is 11,077 bp, with individual genes ranging from 159 to 1680 bp. The average
A+T content is 72.7%, ranging from 67.84% (cox1) to 84.28% (atp8) (Table 1). The AT-skew
and GC-skew are -0.182 and 0.011, respectively (Table 3). All of the PCGs are initiated by
the start codon ATN (ATT, ATG, ATA and ATC), except cox1 (ACG) and cob (TTG), which is
consistent with the mitogenomes of most invertebrate species (Kong et al. 2009, Lee et al.
2016). The majority of the PCGs are terminated with TAA, whereas nad1 uses TAG as the
stop codon (Table 3). The most frequently used amino acid in P. serratifrons is Leu and the
least common amino acid is Cys (Fig. 2). The relative synonymous codon usage (RSCU)
values of the 13 PCGs of P. serratifrons are shown in Table 3 and Fig. 2. The most
frequently detected codon is CUU (Leu), whereas GCA (Gln) is the least common
codon. Based on CDspT and RSCU, comparative analyses showed that the codon usage
pattern of P. serratifrons is conserved. The codon usage patterns of the 13 PCGs are
similar to those of other Porcellanidae species (Tan et al. 2014).
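
For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons of that amino acid were used equally. The Python sketch below is a minimal illustration of that definition, not the implementation in DAMBE or MEGA; the function name is hypothetical and the counts in the usage example are those listed for the Phe and Val codon families in Table 3.

```python
from typing import Dict


def rscu(family_counts: Dict[str, int]) -> Dict[str, float]:
    """Relative synonymous codon usage for one synonymous codon family.

    RSCU = observed count * number of synonymous codons / total count of the family.
    """
    total = sum(family_counts.values())
    n_codons = len(family_counts)
    if total == 0:
        # No observations: usage is undefined, so report 0.0 for every codon.
        return {codon: 0.0 for codon in family_counts}
    return {codon: count * n_codons / total for codon, count in family_counts.items()}


# Codon counts for Phe and Val as listed in Table 3.
phe = rscu({"UUU": 407, "UUC": 126})
val = rscu({"GUU": 71, "GUC": 13, "GUA": 29, "GUG": 17})
```

Rounded to three decimals, these counts give 1.527 and 0.473 for UUU and UUC and 2.185 for GUU, matching the corresponding entries in Table 3.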


Figure 1.


Circular mitogenome map of P. serratifrons. Protein coding, ribosomal and tRNA genes are
shown with standard abbreviations. Arrows indicate the orientation of gene transcription. The
inner circles show the G-C content and GC-skew, which are plotted as the deviation from the
average value of the entire sequence.


Table 3. 


The codon number and relative synonymous codon usage in the mitochondrial genome of
P. serratifrons.


Codon   Count  RSCU    Codon   Count  RSCU    Codon   Count  RSCU    Codon   Count  RSCU
UUU(F)  407    1.527   UCU(S)  98     1.549   UAU(Y)  224    1.697   GAA(E)  28     1.333
UUC(F)  126    0.473   UCC(S)  47     0.743   UAC(Y)  40     0.303   UGU(C)  73     1.315
CUA(L)  9      0.308   UCA(S)  88     1.391   UGA(*)  72     1.049   UGC(C)  38     0.685
CUC(L)  29     0.991   UCG(S)  20     0.316   UAG(*)  40     0.583   UGG(W)  63     1
CUG(L)  4      0.137   CCU(P)  30     1.463   UAA(*)  94     1.369   CGU(R)  10     1.053
CUU(L)  75     2.564   CCC(P)  13     0.634   CAU(H)  29     1.055   CGC(R)  7      0.737
UUA(L)  31     1.344   CCA(P)  35     1.707   CAC(H)  26     0.945   CGA(R)  17     1.789
UUG(L)  64     0.656   CCG(P)  4      0.195   CAA(Q)  29     1.706   CGG(R)  4      0.421
AUU(I)  255    2.029   ACU(T)  55     1.467   CAG(Q)  5      0.294   AGA(R)  58     0.967
AUC(I)  57     0.454   ACC(T)  37     0.987   AAU(N)  216    1.459   AGG(R)  62     1.033
AUA(I)  65     0.517   ACA(T)  45     1.2     AAC(N)  80     0.541   AGU(S)  84     1.084
AUG(M)  41     1       ACG(T)  13     0.347   AAA(K)  134    1.403   AGC(S)  71     0.916
GUU(V)  71     2.185   GCU(A)  34     2       AAG(K)  57     0.597   GGU(G)  29     1.036
GUC(V)  13     0.4     GCC(A)  9      0.529   GAU(D)  53     1.797   GGC(G)  18     0.643
GUA(V)  29     0.892   GCA(A)  16     0.941   GAC(D)  6      0.203   GGA(G)  36     1.286
GUG(V)  17     0.523   GCG(A)  9      0.529   GAG(E)  14     0.667   GGG(G)  29     1.036


Figure 2.


Codon usage patterns in the mitogenome of P. serratifrons. CDspT, codons per thousand
codons. Codon families are provided on the x-axis (A) and the relative synonymous codon
usage (RSCU) (B).


Transfer RNAs, ribosomal RNAs 


Like most Porcellanidae species, the P. serratifrons mitogenome contains 22 tRNA genes
(Lee et al. 2016). Fourteen of them are encoded on the heavy (H-) strand and the rest are
encoded on the light (L-) strand. In the whole mitogenome, the size of the tRNAs ranges
from 64 to 70 bp and their total length is 1477 bp, with an obvious AT bias (76.98%)
(Table 2). The AT-skew and GC-skew are 0.043 and 0.111, respectively, showing a slight
bias towards the use of As and an apparent bias towards Gs (Table 2). The trnS1 gene
cannot form a typical clover-leaf secondary structure because it lacks the dihydrouridine
(DHU) arm, while the other tRNAs are capable of folding into a typical clover-leaf
secondary structure (Fig. 3). This variation in trnS1 structure is consistent with the trnS1
structure reported in other invertebrate mitogenomes (Yang et al. 2008, Tan et al. 2019).
The rrnS and rrnL genes are 776 and 1303 bp long, respectively, and are separated by
tRNA-Val, as is typical (Table 1). These sizes are similar
to those of other Porcellanidae species. The A+T content of the rRNAs is 77.73%. The
AT-skew and GC-skew are 0.025 and 0.374, respectively, suggesting a slight bias towards
the use of As and an apparent bias towards Gs (Table 2). As in most typical mitogenomes
of other crabs, the CR is located between rrnS and tRNA-Met. The 371 bp CR is obviously
AT biased (77.63%). The AT-skew and GC-skew are -0.143 and -0.320, respectively,
indicating an obvious bias towards the use of Ts and Cs. The index of substitution
saturation (Iss) was measured in DAMBE 5 under the GTR substitution model for the
combined dataset of all PCGs of the 31 Anomura mitogenomes and was significantly lower
(Iss = 0.647) than the critical value (Iss.cSym = 0.879). The sequences are therefore not
saturated and are suitable for phylogenetic reconstruction.


Figure 3.


Putative secondary structures of tRNAs from the P. serratifrons mitogenome. The tRNAs are 
labelled with the abbreviations of their corresponding amino acids. 


Gene re-arrangement 


Compared with the gene arrangement in the ancestral crustaceans (pancrustacean ground
pattern), the gene order in the P. serratifrons mitogenome has undergone a massive
re-arrangement. As Fig. 4 shows, at least five gene clusters (or genes) differ from the
typical arrangement, involving eleven tRNA genes (D, G, A, R, N, S1, E, P, I, Q
and M) and four PCGs (atp8, atp6, cox3 and nad3). The re-arrangement of the five
gene clusters (or genes) is as follows (Fig. 5): (1) the G-nad3-A gene cluster moved
downstream of K; (2) the D-atp8-atp6-cox3 gene cluster shifted downstream of nad2; (3)
the cluster of four tRNAs (R-N-S1-E) shifted upstream of W; (4) the I-Q-M cluster was
divided into two sections: its order changed to M-I-Q and the single Q then moved
downstream of W; (5) a single P moved from downstream of T to downstream of S2.


Figure 4.


Gene re-arrangements in the P. serratifrons mitogenome. Gene re-arrangement steps: A
ancestral gene arrangement of crustaceans; B gene order in the P. serratifrons mitogenome.


Figure 5.


Inferred intermediate steps between the ancestral gene arrangement of crustaceans and the
P. serratifrons mitogenome. A Duplication-loss and translocation in the ancestral mitogenome
of crustaceans. The duplicated gene block is boxed with dashed lines and the lost genes are
labelled in grey; B Translocation; C The final gene order in the P. serratifrons mitogenome.


At present, there are three models to explain mitochondrial genome re-arrangement:
(1) tandem duplication-random loss (Moritz and Brown 1987); (2) duplication-non-random
loss (Lavrov et al. 2002); (3) recombination (Rokas et al. 2003). Based on the
mitochondrial sequence characteristics of P. serratifrons, we infer that tandem
duplication-random loss and recombination produced the observed re-arrangement.
Firstly, two gene clusters underwent complete duplication, forming two dimeric
blocks, (D-atp8-atp6-cox3-G-nad3-A)-(D-atp8-atp6-cox3-G-nad3-A) and (I-Q-M)-(I-Q-M)
(Fig. 5). Because the mitochondrial genome is economical, usually only one copy of each
duplicated gene remains functional, while the other copy loses its function and is randomly
lost during evolution. Here, the first copy of D-atp8-atp6-cox3 and the second copy of
G-nad3-A were lost from the first dimer and the first copies of I and Q and the second copy
of M were lost from the second, with the formation of two new gene blocks,
(G-nad3-A-D-atp8-atp6-cox3) and (M-I-Q).
Tandem duplication followed by random loss is widely used to explain this type of
translocation of mitochondrial genes (Kong et al. 2009, Shi et al. 2015, Sun et al. 2019).
Therefore, we consider the duplication-random loss model to be the most likely
explanation for these two gene block re-arrangements. The two new gene blocks then
underwent translocation. D-atp8-atp6-cox3 from the (G-nad3-A-D-atp8-atp6-cox3) block was
translocated downstream of nad2, leaving G-nad3-A in the original position. Q from the
(M-I-Q) block was translocated downstream of W, leaving M-I in the original position. In the
second step, the cluster of four tRNAs (R-N-S1-E) shifted upstream of W and P was
translocated downstream of S2. The final gene arrangement of the P. serratifrons
mitogenome is shown in Fig. 5C.
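

The duplication-random loss step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a method used in this study: the function name is hypothetical, a gene order is modelled as a plain list of gene names on one strand and each duplicated gene loses exactly one of its two copies.

```python
from typing import List, Set


def tandem_duplication_random_loss(
    order: List[str], start: int, end: int, lose_first_copy: Set[str]
) -> List[str]:
    """Apply one tandem duplication followed by loss of one copy of each gene.

    The block order[start:end] is duplicated in tandem. Genes named in
    lose_first_copy lose their first copy; all other genes in the block lose
    their second copy. The surviving copies keep their relative positions.
    """
    block = order[start:end]
    first_copy_survivors = [g for g in block if g not in lose_first_copy]
    second_copy_survivors = [g for g in block if g in lose_first_copy]
    return order[:start] + first_copy_survivors + second_copy_survivors + order[end:]


# The two blocks discussed in the text.
block_1 = ["D", "atp8", "atp6", "cox3", "G", "nad3", "A"]
new_block_1 = tandem_duplication_random_loss(
    block_1, 0, len(block_1), {"D", "atp8", "atp6", "cox3"}
)
new_block_2 = tandem_duplication_random_loss(["I", "Q", "M"], 0, 3, {"I", "Q"})
```

Applied to the two ancestral blocks, the sketch yields G-nad3-A-D-atp8-atp6-cox3 and M-I-Q, the two new gene blocks named in the text. The subsequent translocations are separate events and are not modelled here.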


Comparing mitochondrial gene order has proved to be a valuable tool in crustacean
phylogeny. Based on a comparative analysis of mitochondrial gene arrangement within
Galatheoidea, we found that eight Galatheoidea mitogenomes showed massive
re-arrangements, which differ from any gene order previously reported in decapods (Fig. 6).
Amongst the eight gene re-arrangement patterns in this study, the (F-nad5-H-nad4-nad4L)
and (rrnL-V-rrnS) regions are extremely conserved, which is consistent with the conclusion
of Shao et al. (2001) that these regions are extremely conserved in animals. The
P. serratifrons mitochondrial gene
arrangement is closest to those of Neopetrolisthes maculatus and Petrolisthes haswelli,
which provides further support for their close relationship. The mitochondrial gene order of
Munida gregaria shared the most similarities with that of Munida isos, while those of
Munidopsis verrilli and Munidopsis lauensis shared higher similarities with that of Shinkaia
crosnieri. These results are consistent with the conclusions from the gene-order-based
phylogenetic tree. The gene order of the Munididae shows complex within-genus
re-arrangement, which may be related to their particular habitat. Our results support the
view that comparisons of mitochondrial gene re-arrangements are a useful tool for
phylogenetic studies.


Figure 6.


Mitochondrial gene arrangements of eight species in Galatheoidea. All genes are
transcribed from left to right. The re-arranged gene blocks are underlined and
compared with the ancestral gene arrangement of Anomura.


Phylogenetic relationships


In the present study, the phylogenetic relationships were analysed, based on the
sequences of the 13 PCGs, to clarify the relationships within Anomura. P. serratifrons and
another 31 known Anomura species were analysed, with O. ceratophthalmus and O.
stimpsoni as outgroups. The two phylogenetic trees (i.e. the Maximum Likelihood (ML) tree
and the Bayesian Inference (BI) tree) had identical topologies with different
support values. Therefore, only one topology (ML) with both support values is
presented (Fig. 7). P. serratifrons, N. maculatus and P. haswelli formed a Porcellanidae
clade with high support. The families Munididae and
Munidopsidae were grouped into one clade, with Porcellanidae as the basal group, which
is similar to what was reported by McLaughlin et al. (2007), based on morphological
characters, and by Gong et al. (2019), based on the amino acid dataset of 13 PCGs.


Figure 7.


The phylogenetic tree inferred from the nucleotide sequences of 13 mitogenome PCGs
using BI and ML methods. Numbers on branches indicate posterior probability (BI) and
bootstrap support (ML).


Amongst the 11 families included in our phylogenetic tree, each family forms a
monophyletic clade with high nodal support, except Paguridae. At a higher level of
classification, most superfamilies of Anomura were found to be monophyletic, except
Paguroidea, which is in line with previous studies (McLaughlin 1983, Tan et al. 2018).
Paguroidea was divided into two clades, ((Coenobitidae + Diogenidae) +
(Lithodidae + Paguridae)), which is consistent with previous results (Tan et al. 2018, Gong
et al. 2019), while Tan et al. (2019) considered that Lithodidae should be excluded from
Paguroidea and placed in a separate superfamily, Lithodoidea. In addition, our
phylogenetic tree showed that (Porcellanidae + (Munidopsidae + Munididae)) formed a
Galatheoidea clade and (Chirostylidae + Kiwaidae) formed a Chirostyloidea clade, which is
consistent with Sun et al. (2019) (based on morphological characters) and Schnabel et al.
(2011) (based on mitochondrial 16S rRNA and nuclear 18S and 28S rRNA). However, the
monophyly of Galatheoidea is still not recognised by some studies, mainly due to the
classification of Chirostylidae. Tan et al. (2018) regarded Chirostylidae as a member of the
Galatheoidea, so that Galatheoidea was polyphyletic in their study.


Divergence time estimation 


The divergence time analysis, based on 13 PCGs of the mitochondrial genome, implies
that the divergence of Anomura occurred in the Late Triassic (~ 225.2 MYA, 95% credibility
interval = 182.79-297.16 MYA, Fig. 8A), which is broadly similar to the conclusion of
Bracken-Grissom et al. (2013) that Anomura originated in the Late Permian, ~ 259
(224-296) MYA. The superfamily Galatheoidea diverged in
the early Jurassic (~ 208 MYA, 95% credibility interval = 167.73-215.52 MYA, Fig. 8B);
Munidopsidae and Munididae diverged during the Early Jurassic (~ 173 MYA, Fig. 8C),
while the family Porcellanidae diverged in the Early Jurassic (~ 187 MYA, Fig. 8D), with
rapid speciation of present-day species occurring since the mid-Miocene (~ 54 MYA,
Fig. 8E). The Lomidae, Kiwaidae and Chirostylidae all originated in the Jurassic (~ 183.81
MYA, 175.62 MYA and 158.48 MYA, respectively). The hermit crabs formed two subclades
during the Jurassic period (~ 191 MYA, Fig. 8F), the first subclade being composed of
Lithodidae and Paguridae. The most recent common ancestor of Lithodidae and Paguridae
diverged in the Middle Tertiary (~ 39.84 MYA, Fig. 8G). The Paguridae
first appeared in the Tertiary (~ 29.5 MYA, Fig. 8H). The second subclade of hermit crabs
formed in the middle Cretaceous (~ 60.3 MYA, Fig. 8I) and
differentiated into the families Albuneidae, Coenobitidae and Diogenidae. This
differentiation began about 20 MY earlier than that of the first subclade and lasted
longer. The results support the multi-family origin of the hermit crabs.


Conclusion 


In this study, the mitogenome of P. serratifrons was sequenced by next-generation
sequencing, thereby generating new mitochondrial data for Porcellanidae. We analysed
the mitogenome of P. serratifrons and found that it is similar to those of other Anomura in
many significant features, including AT-skew and codon usage bias. Compared with the
pancrustacean ground pattern, the gene order in the P. serratifrons mitogenome has
undergone a massive re-arrangement. The Galatheoidea showed eight re-arrangement
patterns and their re-arrangement similarity is consistent with their phylogenetic
relationships. Our phylogenetic tree agrees with previous studies in some respects and
disagrees in others. The
phylogenetic analyses indicated that P. serratifrons, N. maculatus and P. haswelli formed
a Porcellanidae clade. Divergence time estimation implies that the age of Anomura is over
225 MYA, dating back to at least the late Triassic. Most of the extant superfamilies and
families arose during the late Cretaceous to early Tertiary. These results provide insight
into the gene arrangement features of Anomura mitogenomes and lay the foundation for
further phylogenetic studies on Anomura.


Figure 8.


Anomura divergence time estimated using the Bayesian relaxed-molecular clock method. The
95% confidence intervals for each node are shown as light blue bars. 1-3: the three fossil
calibration nodes (corresponding to Suppl. material 3).